A Theoretically Grounded Application of Dropout in Recurrent Neural Networks
Authors
Yarin Gal, Zoubin Ghahramani
Abstract
Recurrent neural networks (RNNs) stand at the forefront of many recent developments in deep learning. Yet a major difficulty with these models is their tendency to overfit, and dropout has been shown to fail when applied to recurrent layers. Recent results at the intersection of Bayesian modelling and deep learning offer a Bayesian interpretation of common deep learning techniques such as dropout. This grounding of dropout in approximate Bayesian inference suggests an extension of the theoretical results, offering insights into the use of dropout with RNN models. We apply this new variational-inference-based dropout technique in LSTM and GRU models, assessing it on language modelling and sentiment analysis tasks. The new approach outperforms existing techniques, and to the best of our knowledge improves on the single-model state of the art in language modelling on the Penn Treebank (73.4 test perplexity). This extends our arsenal of variational tools in deep learning.
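The key idea the abstract describes is that, under the variational interpretation, a dropout mask should be sampled once per sequence and then reused at every time step, on both the inputs and the recurrent hidden state. Below is a minimal PyTorch sketch of that mask-tying scheme, assuming a single-layer LSTMCell loop; the class name, mask shapes, and the 0.25 rate are illustrative, not the authors' code.

```python
import torch
import torch.nn as nn

class VariationalLSTM(nn.Module):
    """Single-layer LSTM with dropout masks sampled once per sequence.

    The same Bernoulli masks are reused at every time step, on both the
    input and the recurrent hidden state, as the variational
    interpretation of dropout prescribes.
    """

    def __init__(self, input_size, hidden_size, dropout=0.25):
        super().__init__()
        self.cell = nn.LSTMCell(input_size, hidden_size)
        self.dropout = dropout
        self.hidden_size = hidden_size

    def forward(self, x):  # x: (seq_len, batch, input_size)
        seq_len, batch, input_size = x.shape
        h = x.new_zeros(batch, self.hidden_size)
        c = x.new_zeros(batch, self.hidden_size)

        if self.training:
            # Sample one mask per sequence (not per step) and rescale.
            keep = 1.0 - self.dropout
            mask_x = x.new_empty(batch, input_size).bernoulli_(keep) / keep
            mask_h = x.new_empty(batch, self.hidden_size).bernoulli_(keep) / keep
        else:
            mask_x = mask_h = 1.0

        outputs = []
        for t in range(seq_len):
            # The same masks are applied at every step of the sequence.
            h, c = self.cell(x[t] * mask_x, (h * mask_h, c))
            outputs.append(h)
        return torch.stack(outputs), (h, c)
```

The only difference from naive dropout is when the masks are drawn: once per sequence here, rather than afresh at each step, which is what makes the recurrent application stable.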
Similar references
Application of artificial neural networks on drought prediction in Yazd (Central Iran)
In recent decades, artificial neural networks (ANNs) have shown great ability in modeling and forecasting non-linear and non-stationary time series, and in most cases, especially in predicting such phenomena, they have performed very well. This paper presents the application of artificial neural networks to drought prediction at the Yazd meteorological station. In this research, different archite...
Neuro-Optimizer: A New Artificial Intelligent Optimization Tool and Its Application for Robot Optimal Controller Design
The main objective of this paper is to introduce a new intelligent optimization technique that uses a prediction-correction strategy supported by a recurrent neural network for finding a near-optimal solution of a given objective function. Recently there have been attempts to use artificial neural networks (ANNs) in optimization problems, and some types of ANNs, such as the Hopfield network and Boltzm...
Stochastic Dropout: Activation-level Dropout to Learn Better Neural Language Models
Recurrent Neural Networks are very powerful computational tools that are capable of learning many tasks across different domains. However, they are prone to overfitting and can be very difficult to regularize. Inspired by Recurrent Dropout [1] and Skip-connections [2], we describe a new and simple regularization scheme: Stochastic Dropout. It resembles the structure of recurrent dropout, but offer...
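The snippet is cut off before it defines the scheme, but the recurrent-dropout baseline it cites as [1] is well documented: a fresh Bernoulli mask is applied at each step, only to the candidate update of the cell state, so the memory itself is never zeroed. Below is a hedged sketch of that baseline (not of Stochastic Dropout itself); the function and tensor names and the 0.25 rate are illustrative.

```python
import torch
import torch.nn.functional as F

def lstm_step_recurrent_dropout(x_t, h, c, w_ih, w_hh, b, p=0.25, training=True):
    """One LSTM step with dropout on the candidate update g only
    (the recurrent-dropout baseline cited as [1] in the snippet)."""
    # w_ih: (4*hidden, input), w_hh: (4*hidden, hidden), b: (4*hidden,)
    gates = x_t @ w_ih.T + h @ w_hh.T + b
    i, f, g, o = gates.chunk(4, dim=1)
    i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
    g = F.dropout(torch.tanh(g), p=p, training=training)  # mask the update only
    c = f * c + i * g          # cell state still accumulates undropped history
    h = o * torch.tanh(c)
    return h, c
```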
Bayesian Sparsification of Recurrent Neural Networks
Recurrent neural networks show state-of-the-art results in many text analysis tasks but often require a lot of memory to store their weights. Recently proposed Sparse Variational Dropout (Molchanov et al., 2017) eliminates the majority of the weights in a feed-forward neural network without significant loss of quality. We apply this technique to sparsify recurrent neural networks. To account for...
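For context, the pruning rule in Sparse Variational Dropout is compact enough to show directly: each weight gets a learned mean and log-variance, and weights whose implied per-weight dropout rate is large are removed. A minimal sketch follows, assuming those two learned tensors exist; the log-alpha threshold of 3 follows Molchanov et al., while the function name, tensor names, and the epsilon are illustrative.

```python
import torch

def sparsify(theta: torch.Tensor, log_sigma2: torch.Tensor, thresh: float = 3.0):
    """Zero out weights whose per-weight dropout rate
    alpha = sigma^2 / theta^2 is large, i.e. log(alpha) > thresh."""
    log_alpha = log_sigma2 - torch.log(theta.pow(2) + 1e-8)
    keep = (log_alpha < thresh).float()
    return theta * keep, 1.0 - keep.mean().item()  # pruned weights, sparsity
```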
Recurrent Neural Network Regularization
We present a simple regularization technique for Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units. Dropout, the most successful technique for regularizing neural networks, does not work well with RNNs and LSTMs. In this paper, we show how to correctly apply dropout to LSTMs, and show that it substantially reduces overfitting on a variety of tasks. These tasks include la...
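The "correct" application this abstract refers to is to place dropout only on the non-recurrent connections, i.e. between stacked layers and on inputs and outputs, leaving the hidden-to-hidden transition untouched. A minimal PyTorch sketch of that placement follows; the layer sizes and the 0.5 rate are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class RegularizedLSTM(nn.Module):
    """Two-layer LSTM language model with dropout only on
    non-recurrent connections, never on the recurrent state."""

    def __init__(self, vocab, emb=200, hidden=200, p=0.5):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.drop = nn.Dropout(p)          # non-recurrent connections only
        self.lstm1 = nn.LSTM(emb, hidden)
        self.lstm2 = nn.LSTM(hidden, hidden)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, tokens):             # tokens: (seq_len, batch)
        x = self.drop(self.embed(tokens))  # input -> layer 1
        h1, _ = self.lstm1(x)
        h2, _ = self.lstm2(self.drop(h1))  # layer 1 -> layer 2
        return self.out(self.drop(h2))     # layer 2 -> softmax logits
```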